Comparing different acoustic modeling techniques for multilingual boosting
نویسندگان
چکیده
In this paper, we explore how different acoustic modeling techniques can benefit from data in languages other than the target language. We propose an algorithm to perform decision tree state clustering for the recently proposed Kullback-Leibler divergence based hidden Markov models (KL-HMM) and compare it to subspace Gaussian mixture modeling (SGMM). KLHMM can exploit multilingual information in the form of universal phoneme posterior features and SGMM benefits from a universal background model that can be trained on multilingual data. Taking the Greek SpeechDat(II) data as an example, we show that KL-HMM performs best for small amounts of target language data.
منابع مشابه
Boosting under-resourced speech recognizers by exploiting out-of-language data - case study on Afrikaans
Under-resourced speech recognizers may benefit from data in languages other than the target language. In this paper, we boost the performance of an Afrikaans speech recognizer by using already available data from other languages. To successfully exploit available multilingual resources, we use posterior features, estimated by multilayer perceptrons that are trained on similar languages. For two...
متن کاملLearning Methods in Multilingual Speech Recognition
One key issue in developing learning methods for multilingual acoustic modeling in large vocabulary automatic speech recognition (ASR) applications is to maximize the benefit of boosting the acoustic training data from multiple source languages while minimizing the negative effects of data impurity arising from language “mismatch”. In this paper, we introduce two learning methods, semiautomatic...
متن کاملA frame level boosting training scheme for acoustic modeling
Conventional Boosting algorithms for acoustic modeling have two notable weaknesses. (1) The objective function aims to minimize utterance error rate, though the goal for most speech recognition systems is to reduce word error rate. (2) During Boosting training, an utterance is treated as a unit for resampling and each frame within the same utterance is assigned equal weight. Intuitively, the fr...
متن کاملPronunciation and Acoustic Model Adaptation for Improving Multilingual Speech Recognition
In this paper, we address the importance of pronunciation and acoustic model adaptation in multilingual speech recognition. When aiming at modeling several languages simultaneously, the degree of speaker and language variability is even greater than when concentrating on only one language. To compensate the pronunciation variability across various speaker, bi-lingual pronunciation modeling is p...
متن کاملTowards multilingual interoperability in automatic speech recognition
In this communication, we address multilingual interoperability aspects in speech recognition. After giving a tentative definition of multilingual interoperability, we discuss speech recognition components and their language-specific aspects. We give a sample overview of past multilingual speech recognition research and development across different speaking styles (read, prepared and conversati...
متن کامل